ABSTRACT

The article proposes a method of educational information quantization for improving content
quality in Learning Management Systems. It considers how to analyze the quality of a
quantized presentation of educational information on the basis of quantitative text
parameters: average frequencies of the parts of speech used in the text, formal text
readability indexes, and lexical and syntactic text variety factors. The quantitative
parameter values are obtained with the phpMorphy morphological analysis library.
INTRODUCTION

The development of tools for preparing educational content lags behind the development of
Learning Management Systems (LMS). The success of an LMS, in turn, depends on the quality
and effective organization of its educational content. Current LMSs, such as Moodle, Ilias,
Claroline, and Atutor, do not allow developers of e-learning courses to assess the quality
of educational content. Yet such assessment is needed to determine the advantages and
disadvantages of educational information and to decide whether, and under which conditions,
it can best be used in e-learning. One way to approach the problem of assessing educational
content quality in LMSs is to apply the methods of quantitative linguistics.

PROBLEM STATEMENT 

The quality and effective organization of educational content directly influence the
following LMS parameters (A. A. Rybanov, 2011):

> Educational content mastering factor ($K$) is the ratio of the educational
content mastered by LMS users during a certain time unit to the content
provided to the users during this time unit:

$$K = \frac{I_e}{I_0}$$

Here $I_e$ is the mastered content; $I_0$ is the provided content. If the same
content has been mastered by the users during various times, the factor $K$
should be divided by the time $t$. To measure $I_e$ and $I_0$, comparative
analysis of the user thesaurus and the educational content thesaurus can
be used (A. A. Rybanov, 2013).

> Educational content mastering speed, or the ratio of the mastering factor to
the mastering time:

$$K_i = \frac{t_i}{t_{avg}}$$

Here $K_i$ is the relative learning time factor; $t_i$ is the time spent by the $i$-th
LMS user for mastering a certain educational content; $t_{avg}$ is the average
time spent for mastering this educational content by a group of LMS users.

> Educational content mastering retention shows the level of the LMS user's
knowledge and skills some time after completion of the e-learning course:

$$\sigma_m = \frac{I_m}{I_0}$$

Here $I_0$ is the provided content; $I_m$ is the educational content retained
and effectively used by the user after some time $t$.


The development of LMS educational content includes the development of content preparation
technologies, such as educational information quantization (V. S. Avanesov, 2012). The
factors $K$, $K_i$, and $\sigma_m$ depend, inter alia, on the quality of educational
information quantization. An important problem is therefore to form a system of
quantitative criteria for assessing this quality.

MATHEMATICAL FORMULATION 


Concept of Educational Information Quantization Process 

Quantization is the division of educational information into elementary fragments
(educational units, steps, frames) of different purposes (informing, training, controlling,
and managing), which facilitates mastering the meaning contained in each fragment. The
volume of text information contained in these fragments must be limited.

The quantization process is a transformation:

$$T \rightarrow T'$$

Here $T = (T_i \mid i = \overline{1,n})$ is the educational information intended for
quantization, and $T_i$ is a logically complete fragment of the educational information $T$;
$T' = (T'_i \mid i = \overline{1,n})$ is the quantized presentation of the educational
information, and $T'_i$ is the educational information quantum associated with the
fragment $T_i$.

The systematic principle of educational information quantization takes the following
regularities into account:

> A large volume of educational information is hard to remember;

> Educational information that is presented compactly and according to a
certain system is perceived better;

> Emphasizing the semantic units in educational information promotes effective
memorizing.

Since the educational information quantum $T'_i$ must contain the most informative part
of the fragment $T_i$, the requirements for the educational information quantum can be
formulated as follows:

> The educational information quantum $T'_i$ must have a lower redundancy and a
higher entropy than $T_i$;

> The educational information quantum $T'_i$ must be smaller in volume than the
corresponding educational information fragment $T_i$: $|T'_i| < |T_i|$.

The process by which the teacher constructs the quantum $T'_i$ for the educational
information fragment $T_i$ consists of the following stages:

> Preparation stage (reading and comprehension of the educational
information fragment $T_i$);

> Analytical stage (highlighting of the main semantic units (sentences,
words, phrases) and construction of the structure of the quantum $T'_i$ for the
educational information fragment $T_i$);

> Construction stage, in which the quantum $T'_i$ for the educational information
fragment $T_i$ is built (the units highlighted earlier are placed in the common
secondary text according to the structure of the quantum $T'_i$).
The semantic units of the quantum $T'_i$ for the educational information fragment $T_i$ can be:

> $Y_1$: a full (unchanged) key sentence of the initial text $T_i$;

> $Y_2$: a paraphrased key sentence of the initial text $T_i$;

> $Y_3$: a sentence constructed of the key words and phrases of the initial
text $T_i$;

> $Y_4$: a sentence generalizing several sentences of the initial text $T_i$.

High-quality quantized educational texts make the educational material understandable to
most students, because dividing the material into parts noticeably reduces the volume of
directly perceived information and the number of meanings in each fragment, thereby making
the meaning of the entire educational text easier to understand.

In addition, working with test items for such texts ensures that the content of each text
is mastered.

Quantitative Characteristics of Educational Information 

Quantitative linguistics is an area of applied linguistics in which language is
studied by means of statistical methods (Keith Johnson, 2008).

The advantage of quantitative methods of studying text is the accuracy and unambiguity of
their results. Quantitative text characteristics need to be calculated for solving the
following tasks:

> Determining style and genre characteristics of texts for the purpose
of their subsequent classification (J. Tuldava, 2004);

> Examining text samples for the purpose of establishing authorship
(J. Grieve, 2007);

> Teaching the language of a speciality (V. V. Ageev, V. M. Sergevnina,
E. I. Yakovleva, 2011).

One of the content preparation technology problems is forming the system of 
quantitative criteria for assessing educational information quantization quality. 
Quantitative text characteristics can form a basis of this criteria system. 


O. A. Wiio suggested using quantitative characteristics for assessing the complexity
factor (O. A. Wiio, 1968): the more adjectives and adverbs there are in a text, the higher
the text complexity. The verb is the liveliest part of speech. Frequent use of verbs in
conjugated forms makes sentences easier to remember and understand: in such sentences,
related words are close to each other and their relations are perceived easily. Verbs
promote text understanding (R. Flesch, 1946).


The automated determination of quantitative text characteristic values is therefore an
important problem.


Automated determination of some quantitative text characteristics can be implemented in
software on the basis of phpMorphy, a PHP morphological analysis library
(http://phpmorphy.sourceforge.net).

The phpMorphy library supports processing of texts in Russian, English, and German. The
library is aimed at solving the following tasks:

> Lemmatization (obtaining the normal form of a word);

> Obtaining all forms of a word;

> Obtaining grammatical information on a word (part of speech,
case, conjugation, etc.);

> Changing the word form according to given grammatical
characteristics;

> Changing the word form according to a given pattern.

Among the great number of quantitative text characteristics, let us consider the following
ones:


> Quantitative characteristics of used parts of speech; 

> Quantitative text readability characteristics; 

> Quantitative text variety characteristics. 

By means of the phpMorphy library, the following low-level quantitative text
characteristics, calculated on the basis of the average frequencies of the parts of speech
used in the text, can be determined:

> Analyticity index is the ratio of the function word quantity to the total
word quantity in the text;

> Verb index is the ratio of the verb quantity to the total word quantity in the
text;

> Substantive index is the ratio of the noun quantity to the total word quantity
in the text;

> Adjective index is the ratio of the adjective quantity to the total word
quantity in the text;

> Pronoun index is the ratio of the pronoun quantity to the total word quantity
in the text;

> Autosemanticity index is the ratio of the meaningful word quantity to the
total word quantity in the text;

> Unmomentous word index is the ratio of the unmomentous word quantity (function
words and pronouns) to the total word quantity in the text;

> Nominal lexicon index is the ratio of the total noun and adjective quantity
to the total word quantity in the text.


Part of speech designations in the phpMorphy library are presented in Table: 1.

Table: 1

Part of speech designations in the phpMorphy library

| Constant | Description |
| --- | --- |
| PMY_RP_NOUN | Noun |
| PMY_RP_ADJ_FULL | Adjective |
| PMY_RP_ADJ_SHORT | Short adjective |
| PMY_RP_INFINITIVE | Infinitive |
| PMY_RP_VERB | Verb in the personal form |
| PMY_RP_ADVERB_PARTICIPLE | Adverbial participle |
| PMY_RP_PARTICIPLE | Participle |
| PMY_RP_PARTICIPLE_SHORT | Short participle |
| PMY_RP_NUMERAL | Numeral |
| PMY_RP_NUMERAL_P | Ordinal numeral |
| PMY_RP_PRONOUN | Pronoun-noun |
| PMY_RP_PRONOUN_PREDK | Pronoun-predicative |
| PMY_RP_PRONOUN_P | Pronominal adjective |
| PMY_RP_ADV | Adverb |
| PMY_RP_PREDK | Predicative |
| PMY_RP_PREP | Preposition |
| PMY_RP_CONJ | Conjunction |
| PMY_RP_INTERJ | Interjection |
| PMY_RP_PARTICLE | Particle |
| PMY_RP_INP | Parenthesis |
| PMY_RP_PHRASE | Phraseological unit |


Low-level quantitative text characteristics can be expressed through the part of speech
designations in the phpMorphy library as follows (each constant stands for the quantity of
words of that part of speech in the text; COUNT_WORDS is the total word quantity in the
text):

> Analyticity index:

Analyticity_index = (PMY_RP_PREP + PMY_RP_CONJ + PMY_RP_PARTICLE) / COUNT_WORDS.

> Verb index:

Verb_index = (PMY_RP_INFINITIVE + PMY_RP_VERB + PMY_RP_ADVERB_PARTICIPLE +
+ PMY_RP_PARTICIPLE + PMY_RP_PARTICIPLE_SHORT) / COUNT_WORDS.

> Substantive index:

Substantive_index = PMY_RP_NOUN / COUNT_WORDS.

> Adjective index:

Adjective_index = (PMY_RP_ADJ_FULL + PMY_RP_ADJ_SHORT) / COUNT_WORDS.

> Pronoun index:

Pronoun_index = (PMY_RP_PRONOUN + PMY_RP_PRONOUN_PREDK +
+ PMY_RP_PRONOUN_P) / COUNT_WORDS.

> Autosemanticity index:

Autosemanticity_index = 1 - Unmomentous_word_index.

> Unmomentous word index:

Unmomentous_word_index = ((PMY_RP_PREP + PMY_RP_CONJ + PMY_RP_PARTICLE) +
+ (PMY_RP_PRONOUN + PMY_RP_PRONOUN_PREDK + PMY_RP_PRONOUN_P)) / COUNT_WORDS.

> Nominal lexicon index:

Nominal_lexicon_index = (PMY_RP_NOUN + PMY_RP_ADJ_FULL +
+ PMY_RP_ADJ_SHORT) / COUNT_WORDS.
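
The index formulas above translate directly into a few lines of code. The following Python
sketch is an illustration only: it assumes that the per-part-of-speech word counts have
already been obtained (for example, from phpMorphy output), and the function name
`pos_indexes`, the dictionary layout, and the sample counts are hypothetical and are not
taken from the article.

```python
# Parts of speech that enter each index, copied from the formulas above.
INDEX_PARTS = {
    "analyticity": ("PMY_RP_PREP", "PMY_RP_CONJ", "PMY_RP_PARTICLE"),
    "verb": ("PMY_RP_INFINITIVE", "PMY_RP_VERB", "PMY_RP_ADVERB_PARTICIPLE",
             "PMY_RP_PARTICIPLE", "PMY_RP_PARTICIPLE_SHORT"),
    "substantive": ("PMY_RP_NOUN",),
    "adjective": ("PMY_RP_ADJ_FULL", "PMY_RP_ADJ_SHORT"),
    "pronoun": ("PMY_RP_PRONOUN", "PMY_RP_PRONOUN_PREDK", "PMY_RP_PRONOUN_P"),
    "nominal_lexicon": ("PMY_RP_NOUN", "PMY_RP_ADJ_FULL", "PMY_RP_ADJ_SHORT"),
}


def pos_indexes(counts, count_words):
    """Return the eight low-level indexes for one text.

    counts maps a part-of-speech constant name to its word count;
    count_words is the total word quantity (COUNT_WORDS).
    """
    if count_words <= 0:
        raise ValueError("count_words must be positive")
    indexes = {
        name: sum(counts.get(part, 0) for part in parts) / count_words
        for name, parts in INDEX_PARTS.items()
    }
    # Unmomentous words are function words plus pronouns.
    indexes["unmomentous_word"] = indexes["analyticity"] + indexes["pronoun"]
    indexes["autosemanticity"] = 1 - indexes["unmomentous_word"]
    return indexes


# Hypothetical counts for a 100-word text (not taken from the article).
sample_counts = {
    "PMY_RP_NOUN": 30, "PMY_RP_VERB": 20, "PMY_RP_ADJ_FULL": 10,
    "PMY_RP_PREP": 15, "PMY_RP_CONJ": 10, "PMY_RP_PRONOUN": 15,
}
sample_indexes = pos_indexes(sample_counts, 100)
```

Keeping the part-of-speech groups in one table makes it easy to check them against the
formulas and to add further indexes without touching the calculation itself.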

Among the quantitative text readability characteristics, two can be highlighted: the
average word length in syllables and the average sentence length in words. These
statistical parameters are used in the formulas for assessing readability and are
necessary for calculating the formal readability index. They can be easily expressed
quantitatively and are suitable for automatic assessment.
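
As an illustration of how these two parameters might be obtained automatically, the Python
sketch below processes a Russian text. It rests on assumptions that are not part of the
article: sentences are split naively on terminal punctuation, words are runs of letters,
and the syllable count of a Russian word is taken to be its number of vowel letters. The
function name `readability_parameters` is hypothetical.

```python
import re

RUSSIAN_VOWELS = set("аеёиоуыэюя")


def readability_parameters(text):
    """Return (average word length in syllables, average sentence length in words)."""
    # Naive sentence split on terminal punctuation; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?…]+", text) if s.strip()]
    # Words are maximal runs of letters (digits and underscores excluded).
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    # Assumption: one syllable per vowel letter in a Russian word.
    syllables = sum(sum(ch in RUSSIAN_VOWELS for ch in word) for word in words)
    return syllables / len(words), len(words) / len(sentences)


avg_syllables, avg_words = readability_parameters("Мама мыла раму. Кот спал.")
```

A production tool would need a more careful sentence splitter (abbreviations, direct
speech) and language-specific syllable rules for texts in other languages.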

Quantitative text variety characteristics are described by the lexical and syntactic variety
factors. Since a factor is not an absolute but a relative value (within a certain value
range), the lengths of the compared texts can be neglected within certain limits. Studying
the internal "dynamics" of an educational text, that is, comparing the factors in different
parts of the text and their ratios to the general factor for the entire text, is of
theoretical interest as well.


The lexical variety factor is the ratio of the lexeme quantity to the total word quantity
in the text:

$$K_{lex} = \frac{L}{W}, \qquad (1)$$

Here $K_{lex}$ is the lexical variety factor; $L$ is the lexeme (word form) quantity in the
text; $W$ is the total word quantity (the units between blanks) in the text. The higher the
$K_{lex}$ value, the higher the lexical variety of the text.


The syntactic variety factor is defined through the ratio of the total sentence quantity to
the total word quantity in the text:

$$K_{syn} = 1 - \frac{S}{W}, \qquad (2)$$

Here $K_{syn}$ is the syntactic variety factor; $S$ is the sentence quantity; $W$ is the
word quantity in the text.

The higher the $K_{syn}$ value, the wordier the sentences in the text are in general and,
therefore, the higher the possible variety of syntactic relations between words within
a single sentence.
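
A minimal Python sketch of both variety factors follows. It reads formula (1) as
$K_{lex} = L / W$ and formula (2) as $K_{syn} = 1 - S / W$, a reading that agrees with the
fragment values reported later in Table: 7 and Table: 9. The tokenization (letters and
digits as words, lower-cased word forms as lexemes, naive sentence splitting) and the
function name `variety_factors` are assumptions made for the example.

```python
import re


def variety_factors(text):
    """Return (K_lex, K_syn) for a plain-text string."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    if not words:
        raise ValueError("text must contain at least one word")
    k_lex = len(set(words)) / len(words)    # distinct word forms / all words
    k_syn = 1 - len(sentences) / len(words)  # 1 - sentences / words
    return k_lex, k_syn


# Nine words, seven distinct word forms, three sentences.
k_lex, k_syn = variety_factors("The cat sat. The cat ran away. It hid.")
```

Because both factors are ratios, they can be computed for the whole text and for each
fragment with the same function and then compared.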
Measuring Quantitative Characteristics of the Educational Information

When a text is processed automatically, the function that determines the part of speech
can return several values for one word form. For example, for the word 'PROGRAM', the
getPartOfSpeech function of the phpMorphy library returns the following array of part of
speech values:

var_dump($morphy->getPartOfSpeech('PROGRAM'));

// array('NOUN', 'ADJECTIVE', 'VERB')

Therefore, the value of each quantitative text characteristic must be accompanied by the
value of its calculation error.

Let us introduce the following designations for the automatic calculation of the quantity
of words in the text $T$ that belong to the part of speech $k$:

> $\eta_k$ is the quantity of single-value determinations of the part of speech $k$;

> $\mu_k$ is the quantity of multiple-value determinations of the part of speech $k$;

> $\theta_k$ is the word quantity of the part of speech $k$ in the text $T$.

The probability distribution of parts of speech in the text $T$ is unknown. Therefore,
according to Laplace's principle of insufficient reason, there are no grounds in automatic
part of speech recognition to consider any candidate part of speech more probable than
the others.

According to the principle of insufficient reason, let us assume that

$$\eta_k \le \theta_k \le \eta_k + \mu_k.$$

From there, let us assume that

$$\theta_k = \eta_k + \frac{\mu_k}{2}.$$

Then the absolute error $\Delta_k$ in automatic determining of the part of speech $k$ is

$$\Delta_k = \frac{\mu_k}{2},$$

and the relative error $\delta_k$ in automatic determining of the part of speech $k$ is

$$\delta_k = \frac{\Delta_k}{\theta_k} \cdot 100\% = \frac{\mu_k}{2\eta_k + \mu_k} \cdot 100\%.$$
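
The estimate can be sketched in Python as follows. The sketch reads the formulas above as
$\theta_k = \eta_k + \mu_k / 2$, $\Delta_k = \mu_k / 2$, and
$\delta_k = \mu_k / (2\eta_k + \mu_k) \cdot 100\%$, which reproduces the rows of Table: 3.
The function names, the input layout (one list of tags per word form), and the tag strings
are hypothetical; how ambiguous words are tallied is an assumption about the procedure.

```python
from collections import Counter


def tally_determinations(tagged_words):
    """Count single-value (eta) and multiple-value (mu) determinations.

    tagged_words is an iterable with one collection of part-of-speech tags
    per word form, as an analyzer might return them.
    """
    eta, mu = Counter(), Counter()
    for tags in tagged_words:
        unique_tags = set(tags)
        if len(unique_tags) == 1:
            eta[unique_tags.pop()] += 1
        else:
            # Ambiguous word form: every candidate part of speech gets one
            # multiple-value determination (unrecognized words add nothing).
            for tag in unique_tags:
                mu[tag] += 1
    return eta, mu


def estimate_word_count(eta_k, mu_k):
    """Return (theta_k, absolute error, relative error in percent)."""
    theta_k = eta_k + mu_k / 2
    abs_error = mu_k / 2
    # The relative error is undefined when the part of speech never occurs.
    rel_error = 100 * mu_k / (2 * eta_k + mu_k) if theta_k else None
    return theta_k, abs_error, rel_error


# Hypothetical analyzer output for four word forms.
eta, mu = tally_determinations([["NOUN"], ["NOUN", "VERB"], ["VERB"],
                                ["NOUN", "ADJECTIVE", "VERB"]])
noun_estimate = estimate_word_count(eta["NOUN"], mu["NOUN"])
```

With the PMY_RP_NOUN values of the initial text ($\eta_k = 429$, $\mu_k = 138$),
`estimate_word_count(429, 138)` gives 498.0, 69.0, and about 13.855 %, in agreement with
Table: 3.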

On the basis of the values $\Delta_k$ and $\delta_k$, let us calculate the errors of
automatic determining of the value of the quantitative characteristic $\beta$ for the
text $T$:

> Absolute error:

$$\Delta_\beta = \frac{1}{2W} \sum_{i \in P} \mu_i$$

> Relative error:

$$\delta_\beta = \frac{\sum_{i \in P} \mu_i}{2 \sum_{i \in P} \eta_i + \sum_{i \in P} \mu_i} \cdot 100\%$$

Here $P$ is the set of parts of speech used in calculating the quantitative characteristic
$\beta$. For example, the adjective index errors are calculated as follows:

$$\Delta_{Adjective\_index} = \frac{\mu_{PMY\_RP\_ADJ\_FULL} + \mu_{PMY\_RP\_ADJ\_SHORT}}{2W}$$

$$\delta_{Adjective\_index} = \frac{\mu_{PMY\_RP\_ADJ\_FULL} + \mu_{PMY\_RP\_ADJ\_SHORT}}{2\,(\eta_{PMY\_RP\_ADJ\_FULL} + \eta_{PMY\_RP\_ADJ\_SHORT}) + \mu_{PMY\_RP\_ADJ\_FULL} + \mu_{PMY\_RP\_ADJ\_SHORT}} \cdot 100\%$$
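
A short Python sketch of the adjective index example follows. It reads the error formulas
as $\Delta_\beta = \sum \mu_i / (2W)$ and
$\delta_\beta = \sum \mu_i / (2 \sum \eta_i + \sum \mu_i) \cdot 100\%$, with the sums taken
over the parts of speech in $P$. The $\eta$ and $\mu$ values are those of the initial text
in Table: 3; the total word quantity $W$ is not stated in the article, so the value used
below is a placeholder, and the function name is hypothetical.

```python
def characteristic_errors(eta, mu, parts, total_words):
    """Return (absolute error, relative error in percent) of a characteristic.

    eta and mu map part-of-speech names to single-value and multiple-value
    determination counts; parts is the set P used by the characteristic.
    """
    mu_sum = sum(mu[p] for p in parts)
    eta_sum = sum(eta[p] for p in parts)
    abs_error = mu_sum / (2 * total_words)
    # Undefined when none of the parts of speech occurs in the text.
    rel_error = 100 * mu_sum / (2 * eta_sum + mu_sum) if (eta_sum or mu_sum) else None
    return abs_error, rel_error


# eta and mu for the adjective parts of speech in the initial text (Table: 3).
eta = {"PMY_RP_ADJ_FULL": 102, "PMY_RP_ADJ_SHORT": 5}
mu = {"PMY_RP_ADJ_FULL": 40, "PMY_RP_ADJ_SHORT": 66}
ASSUMED_TOTAL_WORDS = 1900  # placeholder: W is not stated in the article

abs_error, rel_error = characteristic_errors(
    eta, mu, ("PMY_RP_ADJ_FULL", "PMY_RP_ADJ_SHORT"), ASSUMED_TOTAL_WORDS)
```

The relative error does not depend on $W$ and comes out as 33.125 %, the value listed for
the adjective index of the initial text in Table: 5.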


Formal Readability Index for Educational Information 

The works by G. Hargis (2000), W. H. DuBay (2004), and R. H. Hall and P. Hanna (2004) define
the following groups of elements influencing readability: content, style, format, and
features of organization. It is necessary to distinguish between the formal text
readability $R_{form}(I)$ (S. Cepni, M. Gokdere, M. Kucuk, 2002), which is a function of
the parameters of the educational content $I$ itself only, and the individual text
readability $R_{ind}(I, u)$, which depends both on the characteristics of the educational
content $I$ and on the properties of the reader $u$.

For quantitative assessment of formal readability, it is possible to use the indexes
proposed in the works by J. Tuldava (1975) and R. Flesch (1974). J. Tuldava's index is
calculated according to the formula:

$$R(i, j) = i \cdot \lg j, \qquad (3)$$

Here $R(i, j)$ is the formal readability index (Figure: 1), $i$ is the average word length
in syllables, and $j$ is the average sentence length in words. Formula (3) was developed on
the basis of a regularity observed in various languages; therefore, J. Tuldava's formula is
intended for analyzing texts in different languages. The lower the value of $R(i, j)$, the
better the text is perceived.
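
J. Tuldava's index can be sketched in Python as below. The sketch reads formula (3) as
$R(i, j) = i \cdot \lg j$ with a base-10 logarithm, which reproduces the values reported
later in the article; the function name `tuldava_index` is hypothetical. The inputs are the
whole-text averages from Table: 6.

```python
import math


def tuldava_index(i, j):
    """R(i, j) = i * lg(j); i is syllables per word, j is words per sentence."""
    if i <= 0 or j <= 0:
        raise ValueError("i and j must be positive")
    return i * math.log10(j)


# Whole-text averages reported in Table: 6 of the article.
r_initial = tuldava_index(2.052, 14.264)
r_quantized = tuldava_index(2.023, 12.614)
```

These calls give roughly 2.37 and 2.23, which agree to within rounding with the values
2.368 and 2.227 reported for the initial and the quantized texts.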



Figure: 1

The form of the function $R(i, j)$

R. Flesch's index is calculated according to the formula:

$$Fr(i, j) = 206.835 - a_1 \cdot j - a_2 \cdot i, \qquad (4)$$

Here $a_1$, $a_2$ are the language-dependent factors (for English, $a_1 = 1.015$,
$a_2 = 84.6$; for Russian, $a_1 = 1.3$, $a_2 = 60.1$). The correspondence between
R. Flesch's index values and the linguistic variables "Readability level" and
"Educational level" is shown in Table: 2.

Table: 2

Linguistic variables "Readability level" and "Educational level"
for R. Flesch's index $Fr(i, j)$

| R. Flesch's index $Fr(i, j)$ | Readability level | Educational level |
| --- | --- | --- |
| 90-100 | Very high | 5th grade |
| 80-90 | High | 6th grade |
| 70-80 | Above the average | 7th grade |
| 60-70 | Average | 8th - 9th grades |
| 50-60 | Below the average | 10th - 12th grades |
| 30-50 | Low | College |
| 0-30 | Very low | Graduate |
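
The index and its interpretation can be sketched in Python as follows. The sketch reads
formula (4) as $Fr(i, j) = 206.835 - a_1 \cdot j - a_2 \cdot i$ with the coefficients
quoted above. The function names are hypothetical, and the handling of boundary scores
(a score of exactly 90, 80, and so on is assigned to the higher band) is an assumption,
since the ranges in Table: 2 share their end points.

```python
# (a1, a2) per language, as quoted in the text.
FLESCH_COEFFICIENTS = {"en": (1.015, 84.6), "ru": (1.3, 60.1)}

# (lower bound, readability level, educational level), from Table: 2.
READABILITY_LEVELS = [
    (90, "Very high", "5th grade"),
    (80, "High", "6th grade"),
    (70, "Above the average", "7th grade"),
    (60, "Average", "8th - 9th grades"),
    (50, "Below the average", "10th - 12th grades"),
    (30, "Low", "College"),
    (0, "Very low", "Graduate"),
]


def flesch_index(i, j, language="ru"):
    """Fr(i, j); i is syllables per word, j is words per sentence."""
    a1, a2 = FLESCH_COEFFICIENTS[language]
    return 206.835 - a1 * j - a2 * i


def readability_level(score):
    """Map an index value to (readability level, educational level)."""
    for lower_bound, level, education in READABILITY_LEVELS:
        if score >= lower_bound:
            return level, education
    # Scores below zero are treated as the lowest band (assumption).
    return READABILITY_LEVELS[-1][1:]


# Whole-text averages of the initial text from Table: 6, Russian coefficients.
fr_initial = flesch_index(2.052, 14.264, "ru")
level_initial = readability_level(fr_initial)
```

For these inputs the index is about 64.97, which falls into the "Average" band.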


RESULTS AND DISCUSSION 

A. P. Chekhov's story "The White Forehead Puppy" was used as the experimental
material. The quality of educational information quantization was analyzed on the basis
of two presentations of the story: $T$, the initial (original) text, and $T'$, the
quantized text.

The initial text $T$ was divided into seven logically complete fragments,
$T = (T_i \mid i = \overline{1,7})$, each of which was quantized.

The resulting quantized text is also a set of seven logically complete fragments,
$T' = (T'_i \mid i = \overline{1,7})$; here $T'_i$ is the quantized text fragment obtained
by quantizing the fragment $T_i$.

The results of automatic part of speech recognition carried out with the phpMorphy library
in the initial text $T$ and the quantized text $T'$ are presented in Table: 3 and
Table: 4.
Table: 3

Results of part of speech recognition in the initial text $T$

| Part of speech $k$ | $\eta_k$ | $\mu_k$ | $\theta_k$ | $\Delta_k$ | $\delta_k$, % |
| --- | --- | --- | --- | --- | --- |
| PMY_RP_NOUN | 429 | 138 | 498.0 | 69.0 | 13.855 |
| PMY_RP_ADJ_FULL | 102 | 40 | 122.0 | 20.0 | 16.393 |
| PMY_RP_ADJ_SHORT | 5 | 66 | 38.0 | 33.0 | 86.842 |
| PMY_RP_INFINITIVE | 36 | 5 | 38.5 | 2.5 | 6.494 |
| PMY_RP_VERB | 285 | 59 | 314.5 | 29.5 | 9.380 |
| PMY_RP_ADVERB_PARTICIPLE | 37 | 4 | 39.0 | 2.0 | 5.128 |
| PMY_RP_PARTICIPLE | 15 | 4 | 17.0 | 2.0 | 11.765 |
| PMY_RP_PARTICIPLE_SHORT | 3 | 2 | 4.0 | 1.0 | 25 |
| PMY_RP_NUMERAL | 9 | 9 | 13.5 | 4.5 | 33.333 |
| PMY_RP_NUMERAL_P | 0 | 3 | 1.5 | 1.5 | 100 |
| PMY_RP_PRONOUN | 97 | 98 | 146.0 | 49.0 | 33.562 |
| PMY_RP_PRONOUN_PREDK | 0 | 0 | 0.0 | - | - |
| PMY_RP_PRONOUN_P | 28 | 69 | 62.5 | 34.5 | 55.200 |
| PMY_RP_ADV | 44 | 219 | 153.5 | 109.5 | 71.336 |
| PMY_RP_PREDK | 0 | 32 | 16.0 | 16.0 | 100 |
| PMY_RP_PREP | 203 | 35 | 220.5 | 17.5 | 7.937 |
| PMY_RP_CONJ | 1 | 254 | 128.0 | 127.0 | 99.219 |
| PMY_RP_INTERJ | 0 | 170 | 85.0 | 85.0 | 100 |
| PMY_RP_PARTICLE | 28 | 120 | 88.0 | 60.0 | 68.182 |
| PMY_RP_INP | 0 | 4 | 2.0 | 2.0 | 100 |
| PMY_RP_PHRASE | 0 | 4 | 2.0 | 2.0 | 100 |


Table: 4

Results of part of speech recognition in the quantized text $T'$

| Part of speech $k$ | $\eta_k$ | $\mu_k$ | $\theta_k$ | $\Delta_k$ | $\delta_k$, % |
| --- | --- | --- | --- | --- | --- |
| PMY_RP_NOUN | 216 | 61 | 246.5 | 30.5 | 12.373 |
| PMY_RP_ADJ_FULL | 46 | 16 | 54.0 | 8.0 | 14.815 |
| PMY_RP_ADJ_SHORT | 3 | 33 | 19.5 | 16.5 | 84.615 |
| PMY_RP_INFINITIVE | 20 | 2 | 21.0 | 1.0 | 4.762 |
| PMY_RP_VERB | 140 | 32 | 156.0 | 16.0 | 10.256 |
| PMY_RP_ADVERB_PARTICIPLE | 14 | 1 | 14.5 | 0.5 | 3.448 |
| PMY_RP_PARTICIPLE | 8 | 2 | 9.0 | 1.0 | 11.111 |
| PMY_RP_PARTICIPLE_SHORT | 0 | 0 | 0.0 | - | - |
| PMY_RP_NUMERAL | 5 | 4 | 7.0 | 2.0 | 28.571 |
| PMY_RP_NUMERAL_P | 0 | 1 | 0.5 | 0.5 | 100 |
| PMY_RP_PRONOUN | 43 | 54 | 70.0 | 27.0 | 38.571 |
| PMY_RP_PRONOUN_PREDK | 0 | 0 | 0.0 | - | - |
| PMY_RP_PRONOUN_P | 16 | 37 | 34.5 | 18.5 | 53.623 |
| PMY_RP_ADV | 29 | 102 | 80.0 | 51.0 | 63.750 |
| PMY_RP_PREDK | 0 | 14 | 7.0 | 7.0 | 100 |
| PMY_RP_PREP | 103 | 22 | 114.0 | 11.0 | 9.649 |
| PMY_RP_CONJ | 0 | 126 | 63.0 | 63.0 | 100 |
| PMY_RP_INTERJ | 0 | 83 | 41.5 | 41.5 | 100 |
| PMY_RP_PARTICLE | 15 | 64 | 47.0 | 32.0 | 68.085 |
| PMY_RP_INP | 0 | 2 | 1.0 | 1.0 | 100 |
| PMY_RP_PHRASE | 0 | 1 | 0.5 | 0.5 | 100 |
In percentage terms, the discrepancies in the word distributions among parts of speech for
the initial (Table: 3) and the quantized (Table: 4) texts are insignificant (Figure: 2).

Figure: 2

Comparative analysis of relative word distributions among parts of speech for the initial
and the quantized texts

The values and calculation errors of the quantitative part of speech characteristics for
the initial and the quantized texts are presented in Table: 5.

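Table: 5

Values and errors of calculating quantitative part of speech characteristics

| Quantitative characteristic $\beta$ | Initial text $T$: value | Initial text $T$: $\Delta_\beta$ | Initial text $T$: $\delta_\beta$, % | Quantized text $T'$: value | Quantized text $T'$: $\Delta_\beta$ | Quantized text $T'$: $\delta_\beta$, % |
| --- | --- | --- | --- | --- | --- | --- |
| Analyticity index | .229 | .107 | 46.849 | .236 | .112 | 47.321 |
| Verb index | .216 | .019 | 8.959 | .211 | .019 | 9.227 |
| Substantive index | .261 | .036 | 13.855 | .259 | .032 | 12.373 |
| Adjective index | .084 | .028 | 33.125 | .077 | .026 | 33.333 |
| Pronoun index | .109 | .044 | 40.048 | .110 | .048 | 43.541 |
| Autosemanticity index | .662 | .151 | 22.809 | .654 | .160 | 24.312 |
| Unmomentous word index | .338 | .151 | 44.651 | .346 | .159 | 46.119 |
| Nominal lexicon index | .345 | .064 | 18.541 | .377 | .058 | 17.188 |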
The discrepancies between the corresponding quantitative characteristics for the initial
and the quantized texts are insignificant. The errors $\Delta_k$, $\delta_k$,
$\Delta_\beta$, $\delta_\beta$ can be used for comparative analysis of automatic text
processing programs with regard to their accuracy in determining parts of speech and
quantitative characteristics. Quantization compresses the sentences of the initial text by
means of the following methods: exclusion ($Y_3$), replacement ($Y_2$), and merging
($Y_4$). Accordingly, the quantitative readability characteristics of the initial and the
quantized texts and of their fragments, presented in Table: 6 and Table: 7, show a
reduction of the average sentence length in words in the quantized text. The only
exceptions are the quantized text fragments 2 and 3.

Table: 6

Quantitative readability characteristics of the initial and the quantized texts

| Quantitative characteristic | Initial text $T$ | Quantized text $T'$ |
| --- | --- | --- |
| Average word length in syllables | 2.052 | 2.023 |
| Average sentence length in words | 14.264 | 12.614 |


Table: 7

Quantitative readability characteristics of the initial and the quantized text fragments

| Text fragment | Average word length in syllables: initial text $T$ | Average word length in syllables: quantized text $T'$ | Average sentence length in words: initial text $T$ | Average sentence length in words: quantized text $T'$ |
| --- | --- | --- | --- | --- |
| No. 1 | 2.095 | 2.153 | 23.695 | 11.800 |
| No. 2 | 1.990 | 1.924 | 15.923 | 16.957 |
| No. 3 | 2.137 | 2.111 | 13.9 | 31.500 |
| No. 4 | 2.047 | 1.925 | 21.25 | 16.000 |
| No. 5 | 2.147 | 1.963 | 17.875 | 12.000 |
| No. 6 | 2.056 | 1.850 | 15.765 | 15.000 |
| No. 7 | 2.025 | 2.074 | 9.429 | 9.240 |


Quantitative variety characteristics of the initial and the quantized texts are presented in
Table: 8 and Table: 9. The changes of the factors $K_{lex}$ and $K_{syn}$ as a result of
the quantization procedure are also connected with the use of the exclusion ($Y_3$),
replacement ($Y_2$), and merging ($Y_4$) methods.


Table: 8

Quantitative variety characteristics of the initial and the quantized texts

| Quantitative characteristic | Initial text $T$ | Quantized text $T'$ |
| --- | --- | --- |
| $K_{lex}$ | .306 | .355 |
| $K_{syn}$ | .944 | .940 |
Table: 9

Quantitative variety characteristics of the initial and the quantized text fragments

| Text fragment | $K_{lex}$: initial text $T$ | $K_{lex}$: quantized text $T'$ | $K_{syn}$: initial text $T$ | $K_{syn}$: quantized text $T'$ |
| --- | --- | --- | --- | --- |
| No. 1 | .688 | .831 | .958 | .915 |
| No. 2 | .628 | .703 | .937 | .941 |
| No. 3 | .712 | .730 | .928 | .968 |
| No. 4 | .612 | .738 | .953 | .938 |
| No. 5 | .706 | .731 | .944 | .917 |
| No. 6 | .590 | .700 | .937 | .933 |
| No. 7 | .583 | .636 | .894 | .892 |


Lexical variety characterizes information saturation of the text. Reduction of the 
wordform repetition degree is characteristic of the quantized text, in comparison with 
the initial text. Therefore the lexical variety factor for the quantized text is a little higher 
than for the initial text (Figure: 3). 



Figure: 3

Comparative analysis of the lexical variety factor for text fragments (series: initial
text, quantized text)

Syntactic variety shows itself in the use of various syntactic means; quantization reduces
the syntactic variety factor. In Figure: 4, the syntactic variety factor for the quantized
text fragments 2 and 3 is higher than for the initial text, which indicates that these
fragments need to be requantized.
Figure: 4

Comparative analysis of the syntactic variety factor for text fragments

Let us analyze the changes in formal readability of the quantized text in comparison with
the initial text. Table: 10 shows the formal readability indexes $R(i, j)$ and $Fr(i, j)$
for the corresponding fragments of the initial and the quantized texts.


Table: 10

Indexes $R(i, j)$ and $Fr(i, j)$ for the initial and the quantized text fragments

| Text fragment | $R(i, j)$: initial text $T$ | $R(i, j)$: quantized text $T'$ | $Fr(i, j)$: initial text $T$ | $Fr(i, j)$: quantized text $T'$ |
| --- | --- | --- | --- | --- |
| No. 1 | 2.878 | 2.307 | 50.199 | 62.127 |
| No. 2 | 2.392 | 2.360 | 66.516 | 69.305 |
| No. 3 | 2.442 | 3.163 | 60.350 | 39.007 |
| No. 4 | 2.717 | 2.318 | 56.182 | 70.342 |
| No. 5 | 2.688 | 2.118 | 54.572 | 73.261 |
| No. 6 | 2.462 | 2.176 | 62.777 | 76.150 |
| No. 7 | 1.973 | 2.002 | 72.860 | 70.200 |


The formal readability index $R(i, j)$ is equal to 2.227 for the quantized text and to
2.368 for the initial text, which indicates a better presentation of the quantized text.

At the same time, comparative analysis of the indexes $R(i, j)$ for the initial and the
quantized text fragments (Figure: 5) indicates that the quantized text fragments 3 and 7
require further improvement.



Figure: 5

Comparative analysis of the readability index $R(i, j)$ for text fragments (horizontal
axis: text fragments; series: initial text, quantized text)


A similar situation is observed for Flesch's index: the index $Fr(i, j)$ is equal to 68.855
for the quantized text and to 64.966 for the initial text, which also indicates a better
presentation of the quantized text.

At the same time, comparative analysis of the indexes $Fr(i, j)$ for the initial and the
quantized text fragments (Figure: 6) indicates that the quantized text fragments 3 and 7
require further transformation.
Figure: 6

Comparative analysis of the readability index $Fr(i, j)$ for text fragments (series:
initial text, quantized text)

Thus the syntactic variety factor and the formal readability indexes for the quantized text
fragments 3 and 7 show that these fragments require requantization of the educational
information.

The experiment results allow the following conclusions to be drawn:

> The values of the formal readability indexes $R(i, j)$ and $Fr(i, j)$ for the quantized
text are better than for the initial text, which indicates that the quantized text is
perceived better by the reader.

> Comparative analysis of the formal readability index values and the syntactic
variety factor values for corresponding fragments of the initial and the quantized texts
makes it possible to identify the quantized text fragments that require requantization
of the educational information.
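
The fragment comparison described above can be sketched in Python. The data are the values
of Table: 10; the decision rule (a fragment is flagged when quantization raised $R(i, j)$
or lowered $Fr(i, j)$) is an assumption made for this example and is not spelled out in
the article, and the function name is hypothetical.

```python
# (R initial, R quantized, Fr initial, Fr quantized) per fragment, from Table: 10.
TABLE_10 = {
    1: (2.878, 2.307, 50.199, 62.127),
    2: (2.392, 2.360, 66.516, 69.305),
    3: (2.442, 3.163, 60.350, 39.007),
    4: (2.717, 2.318, 56.182, 70.342),
    5: (2.688, 2.118, 54.572, 73.261),
    6: (2.462, 2.176, 62.777, 76.150),
    7: (1.973, 2.002, 72.860, 70.200),
}


def fragments_to_requantize(rows):
    """Return the numbers of fragments whose readability got worse."""
    flagged = []
    for number, (r_init, r_quant, fr_init, fr_quant) in sorted(rows.items()):
        # Lower R and higher Fr both mean better readability.
        if r_quant > r_init or fr_quant < fr_init:
            flagged.append(number)
    return flagged


flagged_fragments = fragments_to_requantize(TABLE_10)
```

For the values of Table: 10 this rule flags fragments 3 and 7, the same fragments that the
comparison of the indexes singles out.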


CONCLUSION 

The considered approach makes it possible to take formal characteristics into account when
assessing the quality of educational text quantization. The procedure for obtaining
metrics and the method for analyzing educational text quantization quality proposed in the
article can be used for preparing educational content for LMSs.

The proposed system of quantitative educational content characteristics (Formulas: 1-4) is
suitable for weakly structured texts. This criteria system is unsuitable for formulas,
tables, graphic and multimedia objects.

Since these objects, as a rule, are not quantizable, the system of quantitative
characteristics (Formulas: 1-4) can be successfully used as a part of automated
educational content preparation systems.